{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1.3 模型选择"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在之前多项式拟合的例子中，我们知道多项式的阶数 $M$ 的选择影响着我们拟合的效果，也决定了模型中参数的个数（模型复杂度）。我们还知道正则项参数 $\\lambda$ 也可以控制模型的复杂度。这些参数的选择是我们所关心的问题。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 验证集 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在最大似然的例子中我们看到，在训练集上的结果好不代表在测试的时候结果一定好，因为存在过拟合的现象。\n",
    "\n",
    "如果我们的数据足够多，我们可以从训练数据中分出一小部分数据作为验证集（`validation set`），在训练过程中对验证集进行测试，选择在验证集上效果最好的模型作为我们的结果。最后再用测试集（`test set`）对模型进行最终的评估。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 交叉验证"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在很多情况下，训练和测试数据都是有限的，为了得到更好的模型，我们希望使用尽可能多的训练数据。但是如果验证集的数据很少，那么在测试集上的测评结果就有很大的不确定性。\n",
    "\n",
    "一个解决这种问题的方法叫做交叉验证（cross-validation），我们将数据分成 $S$ 等份，每次使用 $S-1$ 份数据进行训练，剩下的一份作为验证集，重复 $S$ 次。下图是一个 $S = 4$ 的交叉验证的展示，粉红色的部分表示验证集。\n",
    "\n",
    "当训练数据特别少时，一个合适的方法是使用留一法（leave-one-out），令 $S = N$ 即与训练数据的总数相等。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZ8AAADuCAYAAADm6wpEAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAADcdJREFUeJzt3XFonPd9x/HPx7YWscykDkupS1ILnOzmhCEM/mMBZ7mE\njS2BbQl0+mNLig2mGSmLDQPvz7WdR0pGtmZdScIIaBDj4THaudDBWL0zDMYgwv5DYzm3XuTE7mri\nzIqV4AQsfffHc4aLuJPuznffR8/p/YIg67nn+d03+uPeep57JDkiBABApi1lDwAA2HyIDwAgHfEB\nAKQjPgCAdMQHAJCO+AAA0hEfAEA64gMASEd8AADptvWz88S2bTG9+5dGNcum0XzvoiSp9uCekicZ\nD81mU5JUq9VKnmQ88PUcvrm5uasRcU/Zc2wk7ufX62z/+Ttj6Z/PjHCczaF++DlJUuPcXMmTjId6\nvS5JajQapc4xLvh6Dp/tuYjYV/YcGwmX3QAA6YgPACAd8QEApCM+AIB0xAcAkI74AADSER8AQDri\nAwBIR3wAAOmIDwAgHfEBAKQjPgCAdMQHAJCO+AAA0hEfAEA64gMASEd8AAB9sV2z/Ze2/932x7bD\n9q/3swbxAQD062FJhyXdLWl+kAWIDwBUlO07bG8r4alPSdoREQ9KenWQBYgPAFSA7QOty1tP2n7R\n9iVJNyTda7veeqze4bgF27Ntn0+19j1me8b2vO1PbJ+3PdPLLBHxfxFx/Xb+f8ooJgBgcC+piM7L\nKl7DPxpwnSckHZT0mqRFSYcknbB9LiLOD2PQtRAfAKiWkLQ/Ij69tcH2IOvUJNUi4nJrjZOS3lUR\noaNDmHNNXHYDgGp5oz08t+HUrfBIUkRckfS2pN1DWHtdxAcAquXCkNa52GHbNRV3sI0c8QGAarnR\nYVussf/WLtuXu2wf6Bpev4gPAFTftdbHHe0bbU9K2pk/zvqIDwBU34Kkm5IeX7X9BXU/8ykVd7sB\nQMVFxHXbxyU97+LWt3kVv4XgEUlXh/18tu+S9EetT/e2Pv6B7V9t/fs7EfHhWmsQHwAYD0dUvKY/\nq+Kq1mlJj0k6M4Ln2iHpz1ZtO9D27zclER8AqLqImJU0u8bji5Ke6fDQ1Kr9FtTlpoKIqPc4S9c1\nesV7PgCAdMQHAJCO+AAA0hEfAEA64gMASEd8AADpiA8AIB3xAQCkIz4AgHTEBwCQjvgAANIRHwBA\nOuIDAEhHfAAA6YgPACBdX3/P5+cmJmLfV79yW3/DAYWVWFmxfbbsOcZJ8QccMSx8PYdqV9kDbDSO\niLJnAABsMlx2AwCkIz4AgHTEBwCQjvgAANIRHwBAOuIDAEhHfAAA6YgPACBdX7/hYGJiIqanp0c1\ny6bRbDal5RXV7uOHnoeh+d5FSVLtwT0lTzIems2mJKlWq5U8yfiYm5u7GhH3lD3HRtLXbzjYvn17\nLC0tjXCczaFer0uLS2q88nrZo4yF+uHnJEmNc3MlTzIe6vW6JKnRaJQ6xzixPRcR+8qeYyPhshsA\nIB3xAQCkIz4AgHTEBwCQjvgAANIRHwBAOuIDAEhHfAAA6YgPACAd8QEApCM+AIB0xAcAkI74AADS\nER8AQDriAwBIR3wAAOmIDwCgL7aftv2m7Z/YvmH7HdvHbd/f6xp9/RltAAAk/a2kq5L+QdJPJH1J\n0tckPWV7f0ScXW8B4gMAFWX7DknLEXEz+alnIuL0qln+XtI5SV+X9LvrLcBlNwCoANsHbIftJ22/\naPuSpBuS7rVdbz1W73Dcgu3Zts+nWvsesz1je972J7bP257pZZbV4Wlt+29J85Ie6mUNznwAoFpe\nUhGdl1W8hn804DpPSDoo6TVJi5IOSTph+1xEnO93MduW9AVJ7/ayP/EBgGoJSfsj4tNbG4rX/b7V\nJNUi4nJrjZMqwnFI0tEB1jso6YuSvtXLzlx2A4BqeaM9PLfh1K3wSFJEXJH0tqTd/S5ke6+k70j6\nT0mv9nIM8QGAarkwpHUudth2TdLd/SzSur36h5KuSHq615sfiA8AVMuNDttijf23dtm+3GV7z9fw\nbH9J0r+2nv83IuJ/ez2W93wAoPqutT7uaN9oe1LSzlE8oe2dkn4k6RckPRoRfZ2RceYDANW3IOmm\npMdXbX9B3c98Bmb7F1WE5/OSfjMi/qvfNTjzAYCKi4jrto9Ler51y/O8pIclPaLiNxEM279I2qPi\nNu09tvesmufN9RYgPgAwHo6oeE1/VsVVrdOSHpN0ZgTPtbf18Q9b/61GfABgHETErKTZNR5flPRM\nh4emVu23oC43FUREvcdZBvrBona85wMASEd8AADpiA8AIB3xAQCkIz4AgHTEBwCQjvgAANIRHwBA\nOuIDAEhHfAAA6YgPACAd8QEApCM+AIB0xAcAkI74AADSOSJ639l+X9LF0Y2zeex9oLZ3i7cQ/yFZ\niZWVsz9uni17DqCLXRFxT9lDbCR9xQcAgGHgO28AQDriAwBIR3wAAOmIDwAgHfEBAKQjPgCAdMQH\nAJBuWz87T0xMxPT09Khm2TSazaYkqVarlTzJeGg2m9Lyimr37Sp7lLHQfK/4OfLag3tKnmR8zM3N\nXeWHTD+rrx8y3b59eywtLY1wnM2hXq9LkhqNRqlzjIt6vS4tLqnxyutljzIW6oefkyQ1zs2VPMn4\nsD0XEfvKnmMj4bIbACAd8QEApCM+AIB0xAcAkI74AADSER8AQDriAwBIR3wAAOmIDwAgHfEBAKQj\nPgCAdMQHAJCO+AAA0hEfAEA64gMASEd8AADpiA8AIB3xAQD0xfbv2P6h7fdsf2L7iu1/s/3bva5B\nfAAA/foVSR9LelXS1yR9S9KEpFO2j/SywLbRzQYAGCXbd0hajoibmc8bEX/eYZa/ljQn6U8kfXu9\nNTjzAYAKsH3Adth+0vaLti9JuiHpXtv11mP1Dsct2J5t+3yqte8x2zO251uXzs7bnhl0vohYlnRJ\n0ud62Z8zHwColpdUROdlFa/hHw24zhOSDkp6TdKipEOSTtg+FxHne1nA9l0qLrfdLemp1po/6OVY\n4gMA1RKS9kfEp7c22B5knZqkWkRcbq1xUtK7KiJ0tMc1/knSo61/L0v6nqTnejmQ+ABAtbzRHp7b\ncOpWeCQpIq7YflvS7j7W+GMVZz1flPR7Ks6C7pT0wXoHEh8AqJYLQ1rnYodt11TEpCcRMdf26d/Z\n/r6k07YfWi+Q3HAAANVyo8O2WGP/rV22L3fZPtA1vJYTKs6cfm29HYkPAFTftdbHHe0bbU9K2pk4\nx2SnOTohPgBQfQuSbkp6fNX2F9T9zGdgtj/fYdtWFTcrhKS31luD93wAoOIi4rrt45Ked3Hr27yk\nhyU9IunqCJ5y3vYZSWclXVFxw8HvS/plSX8REf+z3gLEBwDGwxEVr+nPqriqdVrSY5LOjOC5vivp\nt1rr3yXpuooQ/WlEnOxlAeIDABUQEbOSZtd4fFHSMx0emlq134K63FQQEfUeZ/mGpG/0sm83vOcD\nAEhHfAAA6YgPACAd8QEApCM+AIB0xAcAkI74AADSER8AQDriAwBIR3wAAOmIDwAgHfEBAKQjPgCA\ndMQHAJCO+AAA0jkiet/Zfl/SxdGNAwxm7wO1vVu8hW+mhmQlVlbO/rh5tuw5xsiuiLin7CE2kr7i\nAwDAMPCdIgAgHfEBAKQjPgCAdMQHAJCO+AAA0hEfAEA64gMASLetn50nJiZienp6VLNsGs1mU5JU\nq9VKnmQ88PUcrmazKS2vqHbfrrJHGRvv/Oyn8cGHi3yz36av+ExOTuqtt94a1SybRr1elyQ1Go1S\n5xgXfD2Hq16vS4tLarzyetmjjI19X/2Ky55ho6HEAIB0xAcAkI74AADSER8AQDriAwBIR3wAAOmI\nDwAgHfEBAKQjPgCAdMQHAJCO+AAA0hEfAEA64gMASEd8AADpiA8AIB3xAQCkIz4AgHTEBwBwW2x/\n03bYvtTrMcQHADAw2/dLOirp436O2zaacQAAo2b7DknLEXGzxDH+RlJD0qSk+3s9iDMfAKgA2wda\nl7aetP1i6xLXDUn32q63Hqt3OG7B9mzb51OtfY/ZnrE9b/sT2+dtz/Q505clPS7pcL//P5z5AEC1\nvKQiOi+reA3/aMB1npB0UNJrkhYlHZJ0wva5iDi/3sG275T0V5K+HRFN2309OfEBgGoJSfsj4tNb\nG/p94W+pSapFxOXWGiclvasiQkd7OP7rKq6efXOQJyc+AFAtb7SH5zacuhUeSYqIK7bflrR7vQNt\nPyTpiKQDETHQmRfv+QBAtVwY0joXO2y7JunuHo79rqT/iIjjgz45Zz4AUC03OmyLNfbf2mX7cpft\na17Ds/20pEclPWV7qu2hSUlbW9s+joj311qH+ABA9V1rfdzRvtH2pKSdQ36uXa2P3+/y+DuS/lHS\nl9dahPgAQPUtSLqp4rbn77Vtf0Hdz3wG9YPW8612TNIXVNywcLnD459BfACg4iLiuu3jkp53cevb\nvKSHJT0i6eqQn+uCOrzvZPuIpM9FRLczos8gPgAwHo6oeE1/VsXNZKclPSbpTJlDdUN8AKACImJW\n0uwajy9KeqbDQ1Or9ltQl5sKIqI+4Hh9H8ut1gCAdMQHAJCO+AAA0hEfAEA64gMASEd8AADpiA8A\nIB3xAQCkIz4AgHTEBwCQjvgAANIRHwBAOuIDAEhHfAAA6Ryx1p/+XrWz/b6ki6MbB8BGsPeB2t4t\n3sI3p0Pyzs9+Gh98uMjXs01f8QEAYBgoMQAgHfEBAKQjPgCAdMQHAJCO+AAA0hEfAEA64gMASEd8\nAADpiA8AIN3/A9OnA1C5/eSiAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x7f0dab4875f8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import scipy as sp\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline\n",
    "\n",
    "fig, axes = plt.subplots(4, 1)\n",
    "axes = axes.flatten()\n",
    "\n",
    "for i, ax in enumerate(axes):\n",
    "    for j in range(4):\n",
    "        ax.plot([j,j], [0,1], color='black')\n",
    "    ax.set_xlim(0,4)\n",
    "    ax.set_xticks([])\n",
    "    ax.set_ylim(0,1)\n",
    "    ax.set_yticks([])\n",
    "    ax.fill_between(np.linspace(i,i+1), 1, color=\"pink\")\n",
    "    ax.text(4.2, 0.35, \"run {}\".format(i+1), fontsize=\"xx-large\")\n",
    "    \n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 信息量准则"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "交叉验证的一个主要缺点是训练次数会随着 $S$ 的增加而加大，对于一些计算很耗时的模型来说不是很适用。另一个深入的问题是对于每次的训练，我们可能会使用不同的超参来得到最好的效果。\n",
    "\n",
    "因此，我们需要一个更好的方法来决定模型的复杂度。\n",
    "\n",
    "历史上有一些信息量准则被人提出，用来衡量模型的复杂度，例如 `AIC (Akaike, 1974)`（`Akaike information criterion`）最大化：\n",
    "\n",
    "$$\n",
    "\\ln p(\\mathcal D|\\mathbf w_{ML}) - M\n",
    "$$\n",
    "\n",
    "其中 $M$ 是模型中可变参数的数目。类似的准则还有 `BIC（Bayesian information criterion）` 等。这些准则没有考虑到模型参数的不确定性，但是在一些简单的模型上效果很好。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
